Running head: REVISITING THE WIDGET EFFECT

Revisiting The Widget Effect: Teacher Evaluation Reforms and the Distribution of Teacher Effectiveness

Authors

  • Matthew A. Kraft
  • Allison F. Gilmour
Abstract

In 2009, TNTP’s The Widget Effect documented the failure to recognize and act on differences in teacher effectiveness. We revisit these findings by compiling teacher performance ratings across 24 states that adopted major reforms to their teacher evaluation systems. In the vast majority of these states, the percentage of teachers rated Unsatisfactory remains less than 1%. However, the full distributions of ratings vary widely across states, with 0.7% to 26% rated below Proficient and 3% to 62% rated above Proficient. We present original survey data from an urban district illustrating that evaluators perceive more than three times as many teachers in their schools as below Proficient than they rate as such. Interviews with principals reveal several potential explanations for these patterns.

“The failure of evaluation systems to provide accurate and credible information about individual teachers’ instructional performance sustains and reinforces a phenomenon that we have come to call the Widget Effect. The Widget Effect describes the tendency of school districts to assume classroom effectiveness is the same from teacher to teacher. This decades-old fallacy fosters an environment in which teachers cease to be understood as individual professionals, but rather as interchangeable parts.”
(The New Teacher Project, 2009)

In 2009, The New Teacher Project (TNTP) characterized the failure of U.S. public education to recognize and respond to differences in teacher effectiveness as the “Widget Effect” (Weisberg et al., 2009). The study highlighted the discrepancy between formal teacher evaluation ratings and perceptions about the actual distribution of teacher effectiveness.
The authors found that, in most districts, less than 1% of teachers were rated as Unsatisfactory, but 81% of administrators and 57% of teachers could identify a teacher in their school who was ineffective.

The Widget Effect was not the first or only study to draw attention to the failure to differentiate among teachers. Over a decade earlier, Tucker (1997) labeled the U.S. education system’s failure to recognize “incompetent” teaching as the “Lake Wobegon Effect,” referring to Garrison Keillor’s fictitious town where “all the children are above average.” Several studies also characterized teacher evaluation as a superficial exercise that failed to assess instructional quality or to inform teacher professional development and personnel decisions (Donaldson, 2009; Toch & Rothman, 2008).

Growing recognition of the broken teacher evaluation system, amplified by new research documenting the importance of teacher effectiveness (e.g., Rockoff, 2005; Rivkin, Hanushek, & Kain, 2005), helped to generate momentum for evaluation reforms (Donaldson & Papay, 2015). The U.S. Department of Education’s Race to the Top (RTTT) competition and state waivers for regulations in the No Child Left Behind Act created strong incentives for states to make sweeping changes to these systems. Applicants were required to replace binary checklists with systems that included multiple rating categories and differentiated teachers by performance (U.S. DOE, 2009; 2012). The combination of these initiatives along with local reform efforts led to substantial changes in teacher evaluation.

Today, almost every state has designed and adopted new teacher evaluation systems (see Steinberg & Donaldson [in press] for a survey of reform efforts and Donaldson & Papay [2015] for a summary of new evaluation system features). Some scholars view this focus on high-stakes evaluation systems as misplaced (Fullan, 2011; Hallinger, Heck, & Murphy, 2014; Mehta & Fine, 2015).
Even those who see evaluation reforms as promising do not agree on how these systems should be used to improve the teacher workforce. One camp of scholars (Hanushek, 2009) and journalists (Thomas, Wingert, Conant, & Register, 2010) emphasizes the importance of differentiating among teachers in order to motivate them through performance incentives and to dismiss those judged to be low performing. Others see evaluation as central to supporting teachers’ professional growth by providing teachers with individualized feedback and identifying areas for targeted professional support (Almy, 2011; Curtis & Weiner, 2012; Papay, 2012). Both of these theories of action require an evaluation system that differentiates among teachers and accurately assesses the quality of their instruction.

In this paper, we revisit The Widget Effect by examining the degree to which new teacher evaluation systems differentiate among teachers. Research on evaluation reforms has primarily focused on the properties of performance measures (e.g., Grossman, Loeb, Cohen, & Wyckoff, 2013; Kane, McCaffrey, Miller, & Staiger, 2013; and the March 2015 special issue of Educational Researcher), the effect evaluation systems have on teacher satisfaction (Koedel, Li, & Springer, 2015) and student achievement (Dee & Wyckoff, 2015; Steinberg & Sartain, 2015; Taylor & Tyler, 2013), and principals’ use of value-added measures (Goldring et al., 2015; Rockoff, Staiger, Kane, & Taylor, 2012). Research suggests that principals are capable of distinguishing between low and high performing teachers (Harris & Sass, 2014; Jacob & Lefgren, 2008), but that they do not always do so on high-stakes evaluation ratings (Grissom & Loeb, forthcoming). We have little evidence about the degree to which these reforms have fundamentally changed the distribution of teacher performance ratings.
Policymakers have assumed that the sweeping changes to evaluation system features would result in greater differentiation, ignoring Lipsky’s (1980) seminal observation that policies are ultimately made by the “street-level bureaucrats” who implement them. History shows that the success of policy initiatives depends on the will and capacity of local actors to implement reforms (Honig, 2006). This is particularly true in the decentralized U.S. education system where local practice is often decoupled from central policy (Spillane & Kenney, 2012). Guided by this lens, we ask: What is the distribution of teacher performance ratings in states that have adopted reforms to their teacher evaluation systems? Does the distribution of teacher performance ratings reflect principals’ perceptions about the distribution of teacher effectiveness? And, if not, what are principals’ explanations for why teacher evaluation reforms have not resulted in greater differentiation in performance ratings?

We examine these questions with quantitative and qualitative data collected over the course of three years. We begin by presenting data on the distribution of teacher evaluation ratings across 24 states that have implemented teacher evaluation reforms with multiple performance categories. We complement these state-level data with a case study of the distribution of teacher evaluation ratings in one large urban school district. Specifically, we leverage original survey data linked to evaluation records to compare evaluators’ perceptions of the distribution of teacher effectiveness with both their predictions and actual ratings. We then discuss findings from in-depth interviews with a random sample of principals in the district that help to explain why differences existed between evaluators’ perceptions, predictions, and actual performance ratings.
Throughout the paper, we focus much of our analyses and discussion on the percentage of performance ratings below and above Proficient, given the high-stakes incentives and consequences attached to these ratings in many districts (e.g., Dee & Wyckoff, 2015). Together, these data provide new insights about the potential and pitfalls of improving the quality of the teacher workforce through teacher evaluation reforms.

Publication date: 2016